Stratified random sampling from streaming and stored data


Similar resources

Variance-Optimal Offline and Streaming Stratified Random Sampling

Stratified random sampling (SRS) is a fundamental sampling technique that provides accurate estimates for aggregate queries using a small size sample, and has been used widely for approximate query processing. A key question in SRS is how to partition a target sample size among different strata. While Neyman allocation provides a solution that minimizes the variance of an estimate using this sa...

Scalable Simple Random Sampling and Stratified Sampling

Analyzing data sets of billions of records has now become a regular task in many companies and institutions. In the statistical analysis of those massive data sets, sampling generally plays a very important role. In this work, we describe a scalable simple random sampling algorithm, named ScaSRS, which uses probabilistic thresholds to decide on the fly whether to accept, reject, or wait-list an...

Improved Exponential Estimator in Stratified Random Sampling

In this article we have considered the problem of estimating the population mean Y in the stratified random sampling using the information of an auxiliary variable x which is correlated with y and suggested improved exponential ratio estimators in the stratified random sampling. The mean square error (MSE) equations for the proposed estimators have been derived and it is shown that the prop...

Stratified and Un-stratified Sampling in Data Mining: Bagging

Stratified sampling is often used in opinion polls to reduce standard errors, and it is known as a variance reduction technique in sampling theory. The most common resampling approach is based on bootstrapping the dataset with replacement. The main purpose of this work is to investigate extensions of resampling methods in classification problems; specifically, we use decision trees, fr...

Interval Estimation for Small Area Proportions with Small True Proportions from Stratified Random Sampling Survey Data

Consider interval estimation of $m$ small area proportions $P_i$ ($i = 1, \dots, m$), where we assume a stratified random sampling design with equal number of observations $n$ in each stratum, and where the domains of interest are the strata. A $100(1 - \alpha)\%$ confidence interval for $P_i$ that has appeared repeatedly in the literature and is used in application is given by $\hat{P}_i \pm z_{\alpha/2} \sqrt{\mathrm{mse}_i}$, where $\hat{P}_i$ and mse...

Journal

Journal title: Distributed and Parallel Databases

Year: 2020

ISSN: 0926-8782,1573-7578

DOI: 10.1007/s10619-020-07315-w